Group 2

Introduction

Data sourced from paper:

“Apoptosis and other immune biomarkers predict influenza vaccine responsiveness”

Focus of current project:

  • Clean and augment data for analysis

  • PCA

  • Age analysis on antibody response to vaccine

  • Prediction of vaccine response based on probe signal

Flow chart

Data description

library(dplyr)
library(stringr)

# Summarize Age and BMI per age group as "median (min–max)"
numeric_summary <- analysis_data |>
  group_by(Age_Group) |>
  summarize(
    Age_Range = str_c(
      round(median(Age, na.rm = TRUE), 1),
      " (",
      min(Age, na.rm = TRUE),
      "–",
      max(Age, na.rm = TRUE),
      ")"),
    BMI_Range = str_c(
      round(median(BMI, na.rm = TRUE), 1),
      " (",
      min(BMI, na.rm = TRUE),
      "–",
      max(BMI, na.rm = TRUE),
      ")"),
    .groups = "drop")
# A tibble: 2 × 3
  Age_Group Age_Range  BMI_Range       
  <chr>     <chr>      <chr>           
1 Older     78 (61–93) 25.1 (18–47.3)  
2 Young     24 (20–30) 22.9 (18.8–43.6)
# Count and percentage per age group for each categorical variable
gender_table <- get_categorical_summary(analysis_data, Gender) |>
  mutate(Variable = "Gender")
cmv_table <- get_categorical_summary(analysis_data, Cytomegalovirus) |>
  mutate(Variable = "Cytomegalovirus")
ebv_table <- get_categorical_summary(analysis_data, EpsteinBarrvirus) |>
  mutate(Variable = "EpsteinBarrvirus")

# Stack the three summaries into one long table
categorical_summary <- bind_rows(gender_table, cmv_table, ebv_table)

# One column per age group, one row per variable level
final_categorical_table <- categorical_summary |>
  pivot_wider(names_from = Age_Group,
              values_from = Value,
              id_cols = c(Variable, Characteristic),
              names_sort = TRUE)
# A tibble: 6 × 4
  Variable         Characteristic Older    Young   
  <chr>            <chr>          <chr>    <chr>   
1 Gender           Female         40 (67%) 14 (48%)
2 Gender           Male           20 (33%) 15 (52%)
3 Cytomegalovirus  Negative       24 (40%) 13 (45%)
4 Cytomegalovirus  Positive       36 (60%) 16 (55%)
5 EpsteinBarrvirus Negative       19 (32%) 13 (45%)
6 EpsteinBarrvirus Positive       41 (68%) 16 (55%)
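
The helper get_categorical_summary() is not shown on this slide. The following is a minimal sketch of what it might look like, assuming it counts each level within Age_Group and formats the result as "n (pct%)". The output column names Characteristic and Value are taken from the pivot_wider() call above. The function body, the handling of missing values, and the toy data are assumptions for illustration only.

```r
library(dplyr)
library(stringr)

# Hypothetical reconstruction of the helper; the real implementation may differ.
get_categorical_summary <- function(data, variable) {
  data |>
    # Assumption: rows with a missing value for the variable are dropped
    filter(!is.na({{ variable }})) |>
    # Count each level of the variable within each age group
    count(Age_Group, Characteristic = as.character({{ variable }}), name = "n") |>
    group_by(Age_Group) |>
    # Percentage is taken within the age group, rounded to a whole number
    mutate(Value = str_c(n, " (", round(100 * n / sum(n)), "%)")) |>
    ungroup() |>
    select(Age_Group, Characteristic, Value)
}

# Toy data, made up for illustration (not the study data)
toy <- tibble(
  Age_Group = c("Older", "Older", "Older", "Young", "Young"),
  Gender    = c("Female", "Female", "Male", "Female", "Male")
)
toy_summary <- get_categorical_summary(toy, Gender)
```

Because percentages are computed within each age group, the Older and Young columns of the final table each sum to 100% per variable, which matches the layout shown above.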

Plots from paper

PCA

Biomarkers for prediction of response

Based on the plot, the two probes that differ most between the groups are ILMN_1688780 and ILMN_1739792.

To identify the probe most likely to predict vaccine response, we create a boxplot for each of these two probes.

ILMN_1688780 shows the clearest, least overlapping difference in pre-vaccine expression between the two groups.
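
A per-probe boxplot comparison of this kind could be sketched as follows. This is only an illustration: the Response column, its labels ("Responder" and "Non-responder"), and the wide layout with one column per probe are assumptions, and the expression values are simulated placeholders, not the study data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Simulated placeholder values; column names and group labels are hypothetical
set.seed(1)
toy_expr <- tibble(
  Response     = rep(c("Responder", "Non-responder"), each = 10),
  ILMN_1688780 = rnorm(20),
  ILMN_1739792 = rnorm(20)
)

# Reshape to long format: one row per sample and probe
probe_long <- toy_expr |>
  pivot_longer(cols = starts_with("ILMN_"),
               names_to = "Probe",
               values_to = "Expression")

# One panel per probe, one box per response group
probe_boxplot <- ggplot(probe_long,
                        aes(x = Response, y = Expression, fill = Response)) +
  geom_boxplot() +
  facet_wrap(~Probe, scales = "free_y") +
  labs(x = "Vaccine response", y = "Pre-vaccine expression")
```

Faceting by probe with free y-axes keeps each probe on its own scale, so the overlap between the two groups can be compared within each panel.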